Search Result

Android malware family classification method based on code image integration

Mo LI, Tianliang LU, Ziheng XIE

Journal of Computer Applications 2022, 42 (5): 1490-1499. DOI: 10.11772/j.issn.1001-9081.2021030486

Code visualization spread rapidly through Android malware research after it was first proposed. To address the limited representational power of a code image converted from a single DEX (classes.dex) file, a new Android malware family classification method based on code image integration was proposed. First, the DEX, XML (androidManifest.xml) and decompiled JAR (classes.jar) files in the Android application package were converted into three gray-scale images, and bilinear interpolation was used to scale gray-scale images of different sizes to a common size. Then, the three gray-scale images were integrated into a three-channel Red-Green-Blue (RGB) image for training and classification. For the classification model, Soft Threshold (ST) Block+ResNeSt (STResNeSt) was proposed by combining a soft-threshold denoising block with the Split-Attention based ResNeSt. The proposed model is robust to noise and pays more attention to the important features of the code image. To handle the long-tail distribution of the training data, Class Balance Loss (CB Loss) was introduced after data augmentation, which provided a feasible remedy for the over-fitting caused by sample imbalance. On the Drebin dataset, the accuracy of the integrated code image is 2.93 percentage points higher than that of the DEX gray-scale image, the accuracy of STResNeSt is 1.1 percentage points higher than that of the Residual Neural Network (ResNet), and data augmentation combined with CB Loss improves the F1 score by up to 2.4 percentage points. Experimental results show that the average classification accuracy of the proposed method reaches 98.97%, so it can classify Android malware families effectively.
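The image-integration step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the row width of 256 bytes, the 224x224 target size, the zero-padding of the last row, the corner-aligned sampling grid, and the channel order (DEX, XML, JAR mapped to R, G, B) are all assumptions, and the function names `bytes_to_gray`, `resize_bilinear` and `integrate_code_images` are hypothetical.

```python
import numpy as np


def bytes_to_gray(data: bytes, width: int = 256) -> np.ndarray:
    """Read raw bytes as 8-bit pixels in row-major order.

    The last row is zero-padded (an assumption; the abstract does not
    say how incomplete rows are handled).
    """
    buf = np.frombuffer(data, dtype=np.uint8)
    height = max(1, -(-len(buf) // width))  # ceiling division, at least 1 row
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(buf)] = buf
    return padded.reshape(height, width)


def resize_bilinear(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Scale a 2-D gray-scale image with bilinear interpolation."""
    in_h, in_w = img.shape
    # Sample positions in input coordinates (corner-aligned grid).
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical blend weights
    wx = (xs - x0)[None, :]  # horizontal blend weights
    f = img.astype(np.float64)
    top = f[y0][:, x0] * (1 - wx) + f[y0][:, x1] * wx
    bottom = f[y1][:, x0] * (1 - wx) + f[y1][:, x1] * wx
    return np.rint(top * (1 - wy) + bottom * wy).astype(np.uint8)


def integrate_code_images(dex: bytes, xml: bytes, jar: bytes,
                          size: int = 224) -> np.ndarray:
    """Stack three resized gray-scale images into one (size, size, 3) array."""
    channels = [
        resize_bilinear(bytes_to_gray(blob), size, size)
        for blob in (dex, xml, jar)  # assumed order: R=DEX, G=XML, B=JAR
    ]
    return np.stack(channels, axis=-1)
```

In practice the three byte strings would be read from the unpacked application package; the resulting array can then be fed to any image classifier that accepts three-channel input.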

Tor website traffic analysis model based on self-attention mechanism and spatiotemporal features

Rongkang XI, Manchun CAI, Tianliang LU, Yanlin LI

Journal of Computer Applications 2022, 42 (10): 3084-3090. DOI: 10.11772/j.issn.1001-9081.2021081452

The onion router (Tor) anonymous communication system is used by criminals to carry out illegal activities on the dark web, which poses a severe challenge to social security. Tor website traffic analysis captures and analyzes Tor website traffic so that illegal behavior hidden on the Internet can be discovered in time and the network can be supervised. On this basis, a Tor website traffic analysis model based on Self-Attention and Hierarchical SpatioTemporal (SA-HST) features was proposed. First, an attention mechanism was introduced to assign different weights to the network traffic features and highlight the important ones. Then, a Convolutional Neural Network (CNN) with a multi-channel parallel structure and a Long Short-Term Memory (LSTM) network were used to extract the spatiotemporal features of the input data. Finally, the Softmax function was used to classify the data. SA-HST achieves 97.14% accuracy in the closed-world scenario, which is 8.74 and 7.84 percentage points higher than the CUMUL (CUMULative sum fingerprinting) model and the deep learning model CNN, respectively. In the open-world scenario, the evaluation metrics derived from SA-HST's confusion matrix stay consistently above 96%. Experimental results show that the self-attention mechanism achieves efficient feature extraction with a lightweight model structure. By capturing important, multi-view spatiotemporal features of anonymous traffic for classification, SA-HST has advantages in classification accuracy, training efficiency and robustness.
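The attention-weighting and Softmax steps mentioned in this abstract can be illustrated with a small sketch. It shows plain scaled dot-product self-attention without learned projections, which is an assumption made for brevity; the abstract does not specify the exact attention form used in SA-HST, and the CNN and LSTM stages are omitted. The function names are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x: np.ndarray):
    """Scaled dot-product self-attention over a (seq_len, d) feature matrix.

    Queries, keys and values are all x itself (no learned projections,
    which a trained model would normally have).
    Returns the re-weighted features and the attention weight matrix.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)       # pairwise similarity of positions
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ x, weights


# Usage with made-up data: 5 traffic time steps, 4 features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 4))
attended, attn = self_attention(features)
```

Each row of `attn` states how strongly one time step attends to every other step, which is one way to read "assigning different weights to the network traffic features".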
